Count data

In total, 5955 crime events were recorded through September 2021.

Violation events

Of these crime events, 775 were classified as violations.

Misdemeanor events

Of these crime events, 3169 were classified as misdemeanors.

Felony events

Of these crime events, 2040 were classified as felonies.
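The per-category counts above can be reproduced with a single grouped count. The following is a minimal sketch, assuming the data frame has a `law_cat` column as in the code later in this report; `toy_crime` and its rows are invented for illustration and are not the real data.

```r
library(dplyr)

# Hypothetical stand-in for raw_sub_crime; the rows are invented.
toy_crime <- tibble(
  crime_event = c("petit larceny", "harrassment 2", "robbery",
                  "petit larceny", "grand larceny", "petit larceny"),
  law_cat     = c("misdemeanor", "violation", "felony",
                  "misdemeanor", "felony", "misdemeanor")
)

# Number of events in each law category.
cat_counts <- toy_crime %>%
  count(law_cat, name = "event_num")

# Total number of events, for comparison with the sum of the category counts.
total_events <- nrow(toy_crime)

cat_counts
```

Comparing `sum(cat_counts$event_num)` with `total_events` is a quick way to check that every event falls into exactly one category.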

Crime events vs. time

Number of crime events each month

The number of crime events is roughly comparable from month to month. March has the highest count and February the lowest; overall, the number of crime events tends to increase over the months.

# Packages used throughout this analysis.
library(tidyverse)
library(plotly)

# Keep events from 2021 (including January 1) and reduce each date to "YYYY-MM".
sub_crime_month = 
  raw_sub_crime %>% 
  filter(start_date >= "2021-01-01") %>% 
  select(start_date, start_time, crime_event, law_cat) %>% 
  mutate(start_date = substring(start_date, 1, 7)) 
  
# Bar chart of the number of events per month.
plot_1 = 
  sub_crime_month %>% 
  group_by(start_date) %>% 
  summarise(event_num = n()) %>% 
  plot_ly(
    x = ~start_date, y = ~event_num, type = "bar"
  )

layout(plot_1, title = "Crime events by month", xaxis = list(title = "Month"), yaxis = list(title = "Number of Crime Events"))

Number of crime events each week

At weekly resolution, the highest number of crime events occurs in the eleventh week, which falls in March. A local maximum appears roughly every ten weeks, and the counts rise sharply at the start of the observed period and drop sharply at its end.

# Number the calendar weeks (cut() starts weeks on Monday) and count events per week.
sub_crime_week = 
  raw_sub_crime %>% 
  select(start_date, start_time, crime_event, law_cat) %>% 
  mutate(week = cut(as.Date(start_date), breaks = "1 week", labels = FALSE)) %>% 
  count(week, name = "event_num")
  
# Scatter plot of weekly counts; the plotly mode is "markers", not "marker".
plot_2 = 
  sub_crime_week %>% 
  plot_ly(
    x = ~week, y = ~event_num, type = "scatter", mode = "markers"
  )

layout(plot_2, title = "Crime events by week", xaxis = list(title = "Week"), yaxis = list(title = "Number of Crime Events"))
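The local maxima mentioned for the weekly counts can be located programmatically instead of by eye. This is a minimal sketch in base R, assuming a table shaped like `sub_crime_week` (columns `week` and `event_num`); the `weekly` values below are invented for illustration and do not come from the real data.

```r
# Hypothetical weekly counts; not taken from the real data set.
weekly <- data.frame(
  week      = 1:12,
  event_num = c(90, 120, 110, 105, 130, 150, 140, 125, 135, 160, 145, 100)
)

# A week is a local maximum when the count rises into it and falls after it,
# i.e. the sign of the week-to-week change flips from +1 to -1.
slope_sign <- sign(diff(weekly$event_num))
peak_idx   <- which(diff(slope_sign) == -2) + 1

# Weeks at which a local maximum occurs, and the spacing between them.
peak_weeks   <- weekly$week[peak_idx]
peak_spacing <- diff(peak_weeks)

weekly[peak_idx, ]
```

The first and last weeks are never flagged by this approach, since they have only one neighbor. Partial weeks at either end of the data can therefore produce a sharp rise or drop without being reported as a peak.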

Top 5 crime events vs. occurrence time

Over the first nine months of 2021, the most frequent crime events occurred mainly in the afternoon and early evening, from 12:00 to 20:00. Within that window, criminal mischief is the most frequent event. In New York City, criminal mischief includes intentionally damaging property and participating in the destruction of an abandoned building.

Third-degree assault is the second most frequent crime event in that time interval; it includes acting with intent to cause physical injury to another person.

Second-degree harassment is the third most frequent event in that interval; it covers acting with intent to harass, annoy, or alarm another person, for example by striking them or otherwise making physical contact with them.
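The ranking of event types described above can be derived from the data rather than typed in by hand. The following is a minimal sketch, assuming a `crime_event` column as in the code in this report; `toy_crime` and its rows are invented for illustration, and the sketch keeps the top three types only to stay short.

```r
library(dplyr)

# Hypothetical stand-in for raw_sub_crime; the rows are invented.
toy_crime <- tibble(
  crime_event = c(
    rep("criminal mischief & related of", 5),
    rep("assault 3 & related offenses", 4),
    rep("harrassment 2", 3),
    rep("grand larceny", 2),
    "robbery"
  )
)

# Count each event type and keep the three most frequent ones.
top_events <- toy_crime %>%
  count(crime_event, name = "event_num") %>%
  slice_max(event_num, n = 3, with_ties = FALSE)

top_events
```

The resulting `top_events$crime_event` vector could then be passed to `filter(crime_event %in% ...)` in place of a hard-coded list.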

# Order the time-of-day bins chronologically and keep the five most frequent event types.
crime_occ_time = 
  raw_sub_crime %>% 
  mutate(event_time = ordered(event_time, levels = c("2 AM","6 AM","10 AM","2 PM","6 PM","10 PM"))) %>% 
  filter(crime_event %in% c("criminal mischief & related of","assault 3 & related offenses","harrassment 2","grand larceny","dangerous drugs"))

# Stacked bar chart of event counts per time bin; geom_bar() counts rows by default.
plot_3 = 
  crime_occ_time %>% 
  ggplot(aes(x = event_time, fill = crime_event)) + 
  geom_bar(width = 0.9) + 
  labs(
    title = "Frequency of crime events vs. occurrence time", 
    x = "Occurrence time", 
    y = "Frequency of crime events") + 
  theme_bw() + 
  theme(
    plot.title = element_text(hjust = 1), 
    legend.position = "bottom",
    legend.text = element_text(size = 8)) + 
  # The legend belongs to the fill aesthetic, so the guide is set on fill.
  guides(fill = guide_legend(nrow = 2))

ggplotly(plot_3) %>%
  layout(legend = list(
      orientation = "h",
      xanchor = "center",
      yanchor = "top",
      x = 0.3,
      y = - 0.3
    )
  )
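The code above uses an existing `event_time` column whose derivation is not shown. One way such four-hour bins could be built from a start time is sketched below in base R. It assumes that each label names the midpoint of its bin (so "2 PM" covers 12:00 to 15:59 and "6 PM" covers 16:00 to 19:59), which matches the 12:00 to 20:00 window discussed in the text but is an assumption, not something confirmed by the data; the `start_time` values are invented.

```r
# Hypothetical start times in "HH:MM:SS" form; not taken from the real data.
start_time <- c("00:30:00", "05:10:00", "09:45:00", "13:05:00",
                "17:20:00", "19:59:00", "23:15:00")

# Hour of day as an integer from 0 to 23.
hour <- as.integer(substr(start_time, 1, 2))

# Assumed bin labels: each label is the midpoint of a four-hour bin.
bin_labels <- c("2 AM", "6 AM", "10 AM", "2 PM", "6 PM", "10 PM")

# Integer division by 4 maps hours 0-3 to bin 1, 4-7 to bin 2, and so on.
event_time <- factor(bin_labels[hour %/% 4 + 1],
                     levels = bin_labels, ordered = TRUE)

# Share of events that fall between 12:00 and 19:59.
afternoon_share <- mean(event_time %in% c("2 PM", "6 PM"))
```

A share computed this way puts a number on the claim that most events happen in the afternoon.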

Degrees of crime event

sub_crime_degree = 
  raw_sub_crime %>% 
  filter(crime_event %in% c("criminal mischief & related of","assault 3 & related offenses","harrassment 2","grand larceny","dangerous drugs","felony assault", "robbery", "petit larceny", "forgery", "sex crimes")) %>% 
  count(crime_event, law_cat)

plot_4 = 
  sub_crime_degree %>% 
    plot_ly(
    x = ~crime_event, y = ~n, color = ~law_cat, type = "bar"
  )

layout(plot_4, title = "Number of crime events by degree", xaxis = list(title = "Crime events"), yaxis = list(title = "Number of Crime Events"))

Proceeding time

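# Proceeding time is the number of minutes between an event's start and end.
# `start` and `end` are assumed to be date-time columns of raw_sub_crime.
crime_prcd_time = 
  raw_sub_crime %>% 
  drop_na(start_time, end_time) %>%
  mutate(prcd_time = as.numeric(difftime(end, start, units = "mins"))) %>% 
  # Keep events shorter than 35 minutes and drop zero-length ones.
  filter(prcd_time < 35, prcd_time != 0) %>% 
  mutate(quarters = quarters(as.Date(start_date)))

# One box per law category.
plot_5 = 
  crime_prcd_time %>% 
  plot_ly(y = ~ prcd_time, color = ~ law_cat, type = "box")

layout(plot_5, title = "Proceeding time by crime type", xaxis = list(title = "Crime type"), yaxis = list(title = "Proceeding time (mins)"))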

Day of week vs. occurrence time

# Count events for each combination of weekday and start hour.
sub_crime_dow = 
  raw_sub_crime %>% 
  mutate(day_of_week = lubridate::wday(as.Date(start_date), label = TRUE, abbr = FALSE)) %>% 
  mutate(day_of_week = fct_relevel(day_of_week, "Saturday", "Friday", "Thursday", "Wednesday", "Tuesday", "Monday", "Sunday")) %>% 
  separate(start_time, into = c("hour", "minute", "second"), sep = ":") %>% 
  count(day_of_week, hour, name = "crime_num")

# Heatmap of event counts: hour on the x-axis, weekday on the y-axis.
plot_6 = 
  sub_crime_dow %>% 
  plot_ly(
    x = ~ hour, y = ~ day_of_week, z = ~ crime_num, type = "heatmap", colors = "BuPu"
  ) %>%
  colorbar(title = "Events Number", x = 1.1, y = 0.8) 

layout(plot_6, title = "Crime frequency: Day vs. Hour", xaxis = list(title = "Hour"), yaxis = list(title = "Day of week"))